In part one of this three-part series on sharding and parallelism, we'll explore how to scale your Flax NNX models using JAX's distributed computing capabilities, specifically its SPMD (single program, multiple data) paradigm. If you're coming from PyTorch and have started using JAX and Flax NNX, you know that modern models often outgrow a single accelerator. Let's discuss JAX's approach to parallelism and how NNX integrates with it. This episode covers the "why" and "what" of distributed training, introducing the fundamental concepts of parallelism and the core JAX primitives needed to implement them.
Speaker: Robert Crowe